Abstract. Predicate abstraction is a key enabling technology for applying finite-state
model checkers to programs written in mainstream languages. It has been
used very successfully for debugging sequential system-level C code. Although
model checking was originally designed for analyzing concurrent systems, there
is little evidence of fruitful applications of predicate abstraction to shared-variable
concurrent software. The goal of this paper is to close this gap. We have developed
a symmetry-aware predicate abstraction strategy: it takes into account
the replicated structure of C programs that consist of many threads executing
the same procedure, and generates a Boolean program template whose multi-threaded
execution soundly overapproximates the concurrent C program. State
explosion during model checking parallel instantiations of this template can now
be absorbed by exploiting symmetry. We have implemented our method in the
SatAbs predicate abstraction framework, and demonstrate its superior performance
over alternative approaches on a large range of synchronization programs.



1 Introduction 

Concurrent software model checking is one of the most challenging problems facing
the verification community today. Not only does software generally suffer from data
state explosion; concurrent software in particular is also susceptible to state explosion due
to the need to track arbitrary thread interleavings, whose number grows exponentially
with the number of executing threads.

Predicate abstraction [1] was introduced as a way of dealing with data state ex- 
plosion: the program state is approximated via the values of a finite number of predi- 
cates over the program variables. Predicate abstraction turns C programs into finite-state 
Boolean programs [2], which can be model checked. Since insufficiently many predi- 
cates can cause spurious verification results, predicate abstraction is typically embed- 
ded into a counterexample-guided abstraction refinement (CEGAR) framework [3]. The 
feasibility of the overall approach has been convincingly demonstrated for sequential 
software by the success of the Slam project at Microsoft, which was able to discover 
numerous control-dominated errors in low-level operating system code [4]. 

The majority of concurrent software is written using mainstream APIs such as 
POSIX threads (pthreads) in C/C++, or using a combination of language and library 
support, such as the Thread class, Runnable interface and synchronized construct in 
Java. Typically, multiple threads are spawned (up front or dynamically, in response to
varying system load levels) to execute a given procedure in parallel, communicating
via shared global variables. For such shared-variable concurrent programs, predicate 
abstraction success stories similar to that of Slam are few and far between. The bot- 
tleneck is the exponential dependence of the generated state space on the number of 
running threads, which, if not addressed, permits exhaustive exploration of such pro- 
grams only for trivial thread counts. The key to obtaining scalability is to exploit the 
symmetry naturally exhibited by these programs, namely their invariance under permu- 
tations of the participating threads. 

Fortunately, much progress has recently been made on analyzing replicated Boolean 
programs, where a non-recursive Boolean template program is executed concurrently 
by many threads [5-7]. In this paper, we present an approach to predicate-abstracting 
concurrent programs that leverages this recent progress. More precisely: 

- We describe a scheme to translate a non-recursive C program P with shared (global-scope)
and local (procedure-scope) variables into a Boolean program B such that
the n-thread Boolean program, denoted $B^n$, soundly overapproximates the n-thread
C program, denoted $P^n$. We call such an abstraction method symmetry-aware.

- Our method permits predicates over arbitrary C program variables, local or global. 
We illustrate below the ramifications of this objective. 

We also show in this paper how our approach can be implemented for C-like languages, 
complete with pointers and aliasing, and discuss the issues of spurious error detection 
and predicate refinement. In the rest of the Introduction, however, we illustrate why 
approaching the above two main goals naively can render abstraction unsound, creating 
the danger of missing bugs, which defies the purpose of reliable program verification. 

A remark on notation: In program listings, we use == for the comparison operator, 
while = denotes assignment (as in C). Concurrent threads are assumed to interleave with 
statement-level granularity; see the discussion in the Conclusion on this subject. 

1.1 Predicate Abstraction using Mixed Predicates 

The Boolean program B to be built from the C program P will consist of Boolean 
variables, one per predicate as usual. Since B is to be executed by parallel threads, 
its variables have to be partitioned into "shared" and "local". As these variables track 
the values of various predicates over C program variables, the "shared" and "local" 
attributes clearly depend on the attributes of the C variables a predicate is formulated 
over. We therefore classify predicates as follows. 

Definition 1 A local predicate refers solely to local C program variables. Analogously, 
a shared predicate refers solely to shared variables. A mixed predicate is neither local 
nor shared. 

We reasonably assume that each predicate refers to at least one program variable. 
A mixed predicate thus refers to both local and shared variables, and the above clas- 
sification partitions the set of predicates into the three categories. 

Given this classification, consider a local predicate $\phi$, which can change only as a
result of a thread changing one of its local C variables; a change that is not visible to any
other thread. This locality is inherited by the Boolean program if predicate $\phi$ is tracked
by a local Boolean variable. Similarly, shared predicates are naturally tracked by shared
Boolean variables.

For a mixed predicate, the decision whether it should be tracked in the shared or in
the local space of the Boolean program is non-obvious. Consider first the following program
P and the corresponding generated Boolean program B, which tracks the mixed
predicate s != l in a local Boolean variable b:

    P:                          B:
    0: shared int s = 0;        0: local bool b = 1;
       local int l = 1;
    1: assert s != l;           1: assert b;
    2: ++s;                     2: b = b ? * : 1;



Consider the program $P^2$, a two-thread instantiation of P. It is easy to see that execution
of $P^2$ can lead to an assertion violation, while the corresponding concurrent
Boolean program $B^2$ is correct. (In fact, $B^n$ is correct for any n > 0.) As a result,
$B^2$ is an unsound abstraction for $P^2$. Consider now the following program P' and its
Boolean abstraction B', which tracks the mixed predicate s == l in a shared Boolean
variable b:



    P':                             B':
    0: shared int s = 0;            0: shared bool b = 1;
       shared bool t = 1;              shared bool t = 1;
       local int l = 0;
    1: assert t <=> (s == l);       1: assert t <=> b;
    2: assume t;                    2: assume t;
    3: ++l, t = 0;                  3: b, t = (b ? 0 : *), 0;



Execution of $(P')^2$ leads to an assertion violation if the first thread executes P' without
interruption, while $(B')^2$ is correct. We conclude that $(B')^2$ is unsound for $(P')^2$.
The unsoundness can be eliminated by making b local in B'; an analogous reasoning
removes the unsoundness in B as an abstraction for P. It is clear from these examples,
however, that in general a predicate of the form s == l that genuinely depends on s
and l cannot be tracked by a shared or a local variable without further amendments to
the abstraction process.
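
The first example can be replayed mechanically. The following Python sketch is not part of the paper's tool chain; the state encoding and the helper names `set_thread`, `step_P`, `step_B` and `explore` are assumptions made purely for illustration. It enumerates all interleavings of two threads executing P, and of two threads executing B with b kept thread-local, and reports whether an assertion can fail.

```python
VIOLATION = "violation"

def set_thread(threads, tid, value):
    """Return a copy of the per-thread tuple with entry tid replaced."""
    return threads[:tid] + (value,) + threads[tid + 1:]

def step_P(state, tid):
    """One step of thread tid in P:  1: assert s != l;  2: ++s;"""
    s, threads = state
    pc, l = threads[tid]
    if pc == 1:
        return [(s, set_thread(threads, tid, (2, l)))] if s != l else [VIOLATION]
    if pc == 2:
        return [(s + 1, set_thread(threads, tid, (3, l)))]
    return []  # thread has terminated

def step_B(state, tid):
    """One step of thread tid in B:  1: assert b;  2: b = b ? * : 1;"""
    shared, threads = state
    pc, b = threads[tid]
    if pc == 1:
        return [(shared, set_thread(threads, tid, (2, b)))] if b else [VIOLATION]
    if pc == 2:
        new_values = (False, True) if b else (True,)  # '*' is a nondeterministic choice
        return [(shared, set_thread(threads, tid, (3, v))) for v in new_values]
    return []

def explore(step, init):
    """Search all interleavings; True iff some assertion can be violated."""
    seen, stack = set(), [init]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        for tid in range(len(state[1])):  # any thread may be scheduled next
            for succ in step(state, tid):
                if succ == VIOLATION:
                    return True
                stack.append(succ)
    return False

p2_fails = explore(step_P, (0, ((1, 1), (1, 1))))           # s = 0, l = 1 in both threads
b2_fails = explore(step_B, (None, ((1, True), (1, True))))  # b = 1 in both threads
print(p2_fails, b2_fails)  # True False
```

The concrete program can fail (one thread increments s before the other checks its assertion), whereas the abstraction with a thread-local b never does, which is exactly the unsoundness described above.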

At this point it may be useful to pause for a moment and consider whether, instead 
of designing solutions that deal with mixed predicates, we may not be better off by 
banning them, relying solely on shared and local predicates. Such restrictions on the 
choice of predicates render very simple bug-free programs unverifiable using predicate 
abstraction, including the following program P'':

    P'':
    0: shared int r = 0;         f():
       shared int s = 0;         4: ++s, ++l;
       local int l = 0;          5: assert s == l;
    1: ++r;                      6: goto 4;
    2: if (r == 1) then
    3:   f();



The assertion in P'' cannot be violated, no matter how many threads execute P'', since
no thread but the first will manage to execute f. It is easy to prove that, over a set of
non-mixed predicates (i.e. no predicate refers to both l and one of {s, r}), no invariant
is computable that is strong enough to prove s == l. We have included such a proof in
Appendix B.

A technically simple solution to all these problems is to instantiate the template P
n times, once for each thread, into programs $\{P_1, \ldots, P_n\}$, in which indices $1, \ldots, n$
are attached to the local variables of the template, indicating the variable's owner. Every
predicate that refers to local variables is similarly instantiated n times. The new program 
has two features: (i) all its variables, having unambiguous names, can be declared at the 
global scope and are thus shared, including the original global program variables, and 
(ii) it is multi-threaded, but the threads no longer execute the same code. Feature (i) 
allows the new program to be predicate-abstracted in the conventional fashion: each 
predicate is stored in a shared Boolean variable. Feature (ii), however, entails that the 
new program is no longer symmetric. Model checking it will therefore have to bear the 
brunt of concurrency state explosion. Such an approach, which we refer to as symmetry- 
oblivious, will not scale beyond a very small number of threads. 

To summarize our findings: Mixed predicates are necessary to prove properties for 
even very simple programs. On the other hand, they cannot be tracked using standard 
thread-local or shared variables. Disambiguating local variables avoids mixed predi- 
cates, but destroys symmetry. It is the goal of this paper to design a solution without the 
need to compromise. 

2 Symmetry-Aware Predicate Abstraction

In order to illustrate our method, let P be a program defined over a set of variables V
that is partitioned in the form $V = V_S \cup V_L$ into shared and local variables. The parallel
execution of P by n threads is a program defined over the shared variables and n copies
of the local variables, one copy for each thread. A thread is nondeterministically chosen
to be active, i.e. to execute a statement of P, potentially modifying the shared variables,
and its own local variables, but nothing else. In this section, we ignore the specific
syntax of statements, and we do not consider language features that introduce aliasing,
such as pointers (these are the subject of Section 3). Therefore, an assignment to a
variable v cannot modify a variable other than v, and an expression $\phi$ depends only on
the variables occurring in it, which we refer to as $Loc(\phi) = \{v : v \text{ occurs in } \phi\}$.

2.1 Mixed Predicates and Notify-All Updates 

Our goal is to translate the program P into a Boolean program B such that, for any n,
a suitably defined parallel execution of B by n threads overapproximates the parallel
execution of P by n threads. Let $E = \{\phi_1, \ldots, \phi_m\}$ be a set of predicates over P, i.e.
a set of Boolean expressions over variables in V. We say $\phi_i$ is

shared if $Loc(\phi_i) \subseteq V_S$,
local if $Loc(\phi_i) \subseteq V_L$, and
mixed otherwise, i.e. $Loc(\phi_i) \cap V_L \neq \emptyset$ and $Loc(\phi_i) \cap V_S \neq \emptyset$.



We declare, in B, Boolean variables $\{b_1, \ldots, b_m\}$; the intention is that $b_i$ tracks the
value of $\phi_i$ during abstract execution of P. We partition these Boolean variables into
shared and local by stipulating that $b_i$ is shared if $\phi_i$ is shared; otherwise $b_i$ is local.
In particular, mixed predicates are tracked in local variables. Intuitively, the value
of a mixed predicate $\phi_i$ depends on the thread it is evaluated over. Declaring $b_i$ shared
would thus necessarily lose information. Declaring it local does not lose information,
but, as the example in the Introduction has shown, is insufficient to guarantee a sound
abstraction. We will see shortly how to solve this problem.

Each statement in P is now translated into a corresponding statement in B. Statements
related to flow of control are handled using techniques from standard predicate
abstraction [2]; the distinction between shared, mixed and local predicates does not
matter here. Consider an assignment to a variable v in P and a Boolean variable b
of B with associated predicate $\phi$. We first check whether variable v affects $\phi$, written
affects(v, $\phi$). Given that in this section we assume no aliasing, this is the case exactly
if $v \in Loc(\phi)$. If affects(v, $\phi$) evaluates to false, b does not change. Otherwise, code
needs to be generated to update b. This code needs to take into account the "flavors" of
v and $\phi$, which give rise to three different flavors of updates of b:

shared update: Suppose v and $\phi$ are both shared. An assignment to v is visible to all
threads, so the truth of $\phi$ is modified for all threads. This is reflected in B: by our
stipulation above, the shared predicate $\phi$ is tracked by the shared variable b. Thus,
we simply generate code to update b according to standard sequential predicate
abstraction rules; the new value of b is shared among all threads.

local update: Suppose v is local and $\phi$ is local or mixed. An assignment to v is visible
only to the active (executing) thread, so the truth of $\phi$ is modified only for the
active thread. This also is reflected in B: by our stipulation above, the local or
mixed predicate $\phi$ is tracked by the local variable b. Again, sequential predicate
abstraction rules suffice; the value of b changes only for the active thread.

notify-all update: Suppose v is shared and $\phi$ is mixed. An assignment to v is visible
to all threads, so the truth of $\phi$ is modified for all threads. This is not reflected in B:
by our stipulation above, the mixed predicate $\phi$ is tracked by the local variable b,
which will be updated only by the active thread. We will solve this problem by (i)
generating code to update b according to standard sequential predicate abstraction
rules, and (ii) notifying all passive threads of the modification of the shared variable
v, so as to allow them to update their local copy of b.

We write must_notify(v, $\phi$) if the shared variable v affects the mixed predicate $\phi$:

    $\mathit{must\_notify}(v, \phi) = \mathit{affects}(v, \phi) \wedge v \in V_S \wedge (Loc(\phi) \cap V_L \neq \emptyset)$.

This formula evaluates to true exactly when it is necessary to notify passive threads of
an update to v. What remains to be discussed in the rest of this section is how notifications
are implemented in B.
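
As a minimal sketch of the case distinction above (no aliasing, as assumed in this section), the following Python functions classify a predicate by its Loc set and select the flavor of update an assignment requires. The function names and the representation of Loc sets as Python sets of variable names are illustrative assumptions, not part of the paper's implementation.

```python
def classify(loc_phi, shared_vars, local_vars):
    """Classify a predicate by the variables it refers to, Loc(phi)."""
    if loc_phi <= shared_vars:
        return "shared"
    if loc_phi <= local_vars:
        return "local"
    return "mixed"

def update_flavor(v, loc_phi, shared_vars, local_vars):
    """Which update of b does an assignment to variable v require?"""
    if v not in loc_phi:                           # affects(v, phi) is false
        return "none"
    if v in shared_vars and loc_phi & local_vars:  # must_notify(v, phi)
        return "notify-all"
    return "shared" if v in shared_vars else "local"

V_S, V_L = {"s", "r"}, {"l"}
mixed = {"s", "l"}                                 # Loc of the predicate s == l

print(classify(mixed, V_S, V_L))                   # mixed
print(update_flavor("s", mixed, V_S, V_L))         # notify-all
print(update_flavor("l", mixed, V_S, V_L))         # local
print(update_flavor("r", {"r"}, V_S, V_L))         # shared
```

Only the second call triggers a broadcast: a shared variable is assigned and the predicate also mentions a local variable.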

2.2 Implementing Notify-All Updates 

We pause to recap some terminology and notation from sequential predicate abstraction [2].
Given a set $E = \{\phi_1, \ldots, \phi_m\}$ of predicates tracked by variables $\{b_1, \ldots, b_m\}$,
an assignment statement st is translated into the following code, in parallel for each
$i \in \{1, \ldots, m\}$:

    if $\mathcal{F}(WP(\phi_i, st))$ then $b_i = 1$
    else if $\mathcal{F}(WP(\neg\phi_i, st))$ then $b_i = 0$          (1)
    else $b_i = *$.

Here, * is the nondeterministic choice expression, WP the weakest precondition operator,
and $\mathcal{F}$ the operator that strengthens an arbitrary predicate to a disjunction of
cubes over the $b_i$. For example, with predicate $\phi$ :: (l < 10) tracked by variable b,
$E = \{\phi\}$ and statement st :: ++l, we obtain $\mathcal{F}(WP(\phi, st)) = \mathcal{F}(l < 9) = \mathit{false}$ and
$\mathcal{F}(WP(\neg\phi, st)) = \mathcal{F}(l \geq 9) = (l \geq 10) = \neg b$, so that (1) reduces to

    b = (b ? * : 0).

In general, (1) is often abbreviated using the assignment

    $b_i = \mathit{choose}(\mathcal{F}(WP(\phi_i, st)), \mathcal{F}(WP(\neg\phi_i, st)))$,

where choose(x, y) returns 1 if x evaluates to true, 0 if $(\neg x) \wedge y$ evaluates to true,
and * otherwise. Abstraction of control flow guards uses the $\mathcal{G}$ operator, which is dual
to $\mathcal{F}$: $\mathcal{G}(\phi) = \neg\mathcal{F}(\neg\phi)$.
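
The worked example (predicate l < 10, statement ++l) can be checked by brute force. The sketch below is an illustrative approximation only: it computes the strengthening operator by enumerating a bounded integer range (an assumption made here so that enumeration terminates) and works over complete predicate valuations rather than cubes of bounded length. The names `strengthen`, `choose`, `wp` and `domain` are hypothetical.

```python
from itertools import product

def strengthen(psi, predicates, domain):
    """Approximate F(psi): the set of predicate valuations all of whose
    concrete states (within `domain`) satisfy psi."""
    implied = set()
    for valuation in product((False, True), repeat=len(predicates)):
        states = [x for x in domain
                  if tuple(p(x) for p in predicates) == valuation]
        if states and all(psi(x) for x in states):
            implied.add(valuation)
    return implied

def choose(x, y):
    """Possible results: {1} if x, {0} if (not x) and y, otherwise '*'."""
    if x:
        return {True}
    if y:
        return {False}
    return {False, True}

phi = lambda l: l < 10                     # predicate tracked by b
wp = lambda post: (lambda l: post(l + 1))  # WP(post, ++l)
domain = range(0, 20)                      # assumed finite range for l

f_pos = strengthen(wp(phi), [phi], domain)                   # F(l < 9)
f_neg = strengthen(wp(lambda l: not phi(l)), [phi], domain)  # F(l >= 9)
next_b = {b: choose((b,) in f_pos, (b,) in f_neg) for b in (False, True)}
print(f_pos, f_neg)  # set() {(False,)}
print(next_b)        # {False: {False}, True: {False, True}}
```

The table `next_b` reads as b = (b ? * : 0): if b holds, the new value is unconstrained; if b is false, it stays false.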

Returning to symmetry-aware predicate abstraction, if must_notify(v, $\phi$) evaluates
to true for $\phi$ and v, predicate $\phi$ is mixed and thus tracked in B by some local Boolean
variable, say b. Predicate-abstracting an assignment of the form $v = \chi$ requires updating
the active thread's copy of b, as well as broadcasting an instruction to all passive
threads to update their copy of b, in view of the new value of v. This is implemented using
two assignments, which are executed in parallel. The first assignment is as follows:

    $b = \mathit{choose}(\mathcal{F}(WP(\phi, v = \chi)), \mathcal{F}(WP(\neg\phi, v = \chi)))$.          (2)

This assignment has standard predicate abstraction semantics:¹ variable b of the active
thread is updated by computing the weakest precondition of predicate $\phi$ and its negation
with respect to the statement $v = \chi$, applying the strengthening operator $\mathcal{F}$ to make
the precondition expressible over the existing predicates, and by applying the choose
operator, which may introduce nondeterminism. Note that, since expression $\chi$ involves
only local variables of the active thread and shared variables, only predicates over those
variables are involved in the defining expression for b.

The second assignment looks similar, but introduces a new symbol:

    $[b] = \mathit{choose}(\mathcal{F}(WP([\phi], v = \chi)), \mathcal{F}(WP(\neg[\phi], v = \chi)))$.          (3)

The notation [b] stands for the copy of local variable b owned by some passive thread.
Similarly, $[\phi]$ stands for the expression defining predicate $\phi$, but with every local variable
occurring in the expression replaced by the copy owned by the passive thread; this
is the predicate $\phi$ in the context of the passive thread. Weakest precondition computation
is with respect to $[\phi]$, while the assignment $v = \chi$, as an argument to WP, is
unchanged: v is shared, and local variables appearing in the defining expression $\chi$ must
be interpreted as local variables of the active thread. Assignment (3) has the effect of
updating variable b in every passive thread. We call Boolean programs involving assignments
of the form [b] = ... Boolean broadcast programs; a formal syntax and
semantics for such programs is given in Appendix A.

¹ Our presentation is in terms of the Cartesian abstraction [8], as used in [2], but our method in
general is independent of the abstraction used.

Let us illustrate the above technique using a canonical example: consider the assignment
s = l, for shared and local variables s and l, and define the mixed predicate
$\phi$ :: (s == l). The first part of the above parallel assignment simplifies to b = true.
For the second part, we obtain:

    $[b] = \mathit{choose}(\mathcal{F}(WP(s == [l], s = l)), \mathcal{F}(WP(\neg(s == [l]), s = l)))$.

Computing weakest preconditions, this reduces to:

    $[b] = \mathit{choose}(\mathcal{F}(l == [l]), \mathcal{F}(\neg(l == [l])))$.

Precision of the Abstraction. To evaluate this expression further, we have to decide on
the set of predicates available to the $\mathcal{F}$ operator to express the preconditions. If this set
includes only predicates over the shared variables and the local variables of the passive
thread that owns [b], the predicate l == [l] is not expressible and must be strengthened
to false. The above assignment then simplifies to [b] = choose(false, false), i.e.
[b] = *. The mixed predicates owned by passive threads are essentially invalidated
when the active thread modifies a shared variable occurring in such predicates, resulting
in a very imprecise abstraction.

We can exploit information stored in predicates local to other threads, to increase the
precision of the abstraction. For maximum precision one could make all other threads'
predicates available to the strengthening operator $\mathcal{F}$. This happens in the symmetry-oblivious
approach sketched in the Introduction, where local and mixed predicates are
physically replicated and declared at the global scope and can thus be made available
to $\mathcal{F}$. Not surprisingly, in practice, replicating predicates in this way renders the abstraction
prohibitively expensive. We analyze this experimentally in Section 5.

A compromise which we have found to work well in practice (again, demonstrated
in Section 5) is to equip operator $\mathcal{F}$ with all shared predicates, all predicates of the
passive thread owning [b], and also predicates of the active thread. This arrangement is
intuitive since the update of a passive thread's local variable [b] is due to an assignment
performed by some active thread. Applying this compromise to our canonical example:
if both s == [l] and s == l evaluate to true before the assignment s = l, we can conclude
that [l] == l before the assignment, and hence s == [l] after the assignment.
Using $\oplus$ to denote exclusive-or, the assignment to [b] becomes:

    $[b] = \mathit{choose}([b] \wedge b,\ [b] \oplus b)$.
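
The soundness of this broadcast update for the canonical example can be sanity-checked by enumeration. The Python sketch below is an illustration only: it ranges over a small integer domain (an assumption, chosen so that the search is finite) and confirms that the concrete truth value of s == [l] after the assignment s = l is always among the values permitted by choose([b] ∧ b, [b] ⊕ b). The helper names are hypothetical.

```python
from itertools import product

def choose(x, y):
    """Possible results: {1} if x, {0} if (not x) and y, otherwise '*'."""
    if x:
        return {True}
    if y:
        return {False}
    return {False, True}

def broadcast_is_sound(domain):
    """Check [b] = choose([b] and b, [b] xor b) against the assignment s = l."""
    for s, l, passive_l in product(domain, repeat=3):
        b = (s == l)                  # active thread's predicate s == l
        passive_b = (s == passive_l)  # passive thread's predicate s == [l]
        allowed = choose(passive_b and b, passive_b != b)
        s_after = l                   # effect of the assignment s = l
        if (s_after == passive_l) not in allowed:
            return False
    return True

print(broadcast_is_sound(range(-3, 4)))  # True
```

When both [b] and b are false the update is nondeterministic, so some precision is still lost, but far less than with the blanket update [b] = *.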

2.3 The Predicate Abstraction Algorithm 

We now show how our technique for soundly handling mixed predicates is used in 
an algorithm for predicate abstracting C-like programs. To present the algorithm com- 
pactly, we assume a language with three types of statement: assignments, nondetermin- 
istic gotos, and assumptions. Control-flow can be modelled via a combination of gotos 
and assumes, in the standard way. 



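Algorithm 1 Predicate abstraction

Input: Program template P, set of predicates $\{\phi_1, \ldots, \phi_m\}$
Output: Boolean program B over variables $b_1, \ldots, b_m$

for each statement d: stmt of P do
    if stmt is $v = \psi$ then
        $\{i_1, \ldots, i_f\} \leftarrow \{i \mid 1 \leq i \leq m \wedge \mathit{affects}(v, \phi_i)\}$
        $\{j_1, \ldots, j_g\} \leftarrow \{j \mid 1 \leq j \leq m \wedge \mathit{must\_notify}(v, \phi_j)\}$
        output d:
            $\begin{pmatrix} b_{i_1} \\ \vdots \\ b_{i_f} \\ [b_{j_1}] \\ \vdots \\ [b_{j_g}] \end{pmatrix} = \begin{pmatrix} \mathit{choose}(\mathcal{F}(WP(\phi_{i_1}, v = \psi)), \mathcal{F}(WP(\neg\phi_{i_1}, v = \psi))) \\ \vdots \\ \mathit{choose}(\mathcal{F}(WP(\phi_{i_f}, v = \psi)), \mathcal{F}(WP(\neg\phi_{i_f}, v = \psi))) \\ \mathit{choose}(\mathcal{F}(WP([\phi_{j_1}], v = \psi)), \mathcal{F}(WP(\neg[\phi_{j_1}], v = \psi))) \\ \vdots \\ \mathit{choose}(\mathcal{F}(WP([\phi_{j_g}], v = \psi)), \mathcal{F}(WP(\neg[\phi_{j_g}], v = \psi))) \end{pmatrix}$;
    else if stmt is goto $d_1, \ldots, d_m$ then
        output d: goto $d_1, \ldots, d_m$;
    else if stmt is assume $\phi$ then
        output d: assume $\mathcal{G}(\phi)$;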



Algorithm 1 processes an input program template of this form and outputs a corresponding
Boolean broadcast program template. Statements goto and assume are handled
as in standard predicate abstraction: the former are left unchanged, while the latter
are translated directly except that the guard of an assume statement is expressed over
Boolean program variables using the $\mathcal{G}$ operator (see Section 2.2).

The interesting part of the algorithm for us is the translation of assignment statements.
For each assignment, a corresponding parallel assignment to Boolean program
variables is generated. The affects and must_notify predicates are used to decide for
which Boolean variables regular and broadcast assignments are required, respectively.
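
The control structure of Algorithm 1 can be sketched in a few lines of Python. This is a purely symbolic illustration under stated assumptions: statements are encoded as tuples, predicates as a dictionary from a Boolean variable name to a pair (predicate text, Loc set), and the WP, F and G operators are left as uninterpreted text rather than computed. Emitting `skip` when no predicate is affected is also an assumption of this sketch.

```python
def abstract_statement(label, stmt, predicates, shared_vars, local_vars):
    """Translate one template statement into Boolean broadcast program text."""
    kind = stmt[0]
    if kind == "goto":                   # gotos are left unchanged
        return f"{label}: goto {', '.join(stmt[1])};"
    if kind == "assume":                 # guards are abstracted with G
        return f"{label}: assume G({stmt[1]});"
    _, v, expr = stmt                    # ("assign", v, expr)
    lhs, rhs = [], []
    for b, (phi, loc_phi) in predicates.items():
        if v in loc_phi:                 # affects(v, phi): regular update
            lhs.append(b)
            rhs.append(f"choose(F(WP({phi}, {v}={expr})), "
                       f"F(WP(!({phi}), {v}={expr})))")
    for b, (phi, loc_phi) in predicates.items():
        if v in loc_phi and v in shared_vars and loc_phi & local_vars:
            lhs.append(f"[{b}]")         # must_notify(v, phi): broadcast update
            rhs.append(f"choose(F(WP([{phi}], {v}={expr})), "
                       f"F(WP(!([{phi}]), {v}={expr})))")
    if not lhs:
        return f"{label}: skip;"         # no tracked predicate is affected
    return f"{label}: {', '.join(lhs)} = {', '.join(rhs)};"

preds = {"b": ("s == l", {"s", "l"})}
print(abstract_statement("1", ("assign", "s", "l"), preds, {"s"}, {"l"}))
print(abstract_statement("2", ("assign", "l", "0"), preds, {"s"}, {"l"}))
print(abstract_statement("3", ("assume", "s == l"), preds, {"s"}, {"l"}))
```

The first call produces a parallel assignment to both b and [b], since s is shared and the predicate is mixed; the second updates only b, because l is local.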



3 Symmetry-Aware Predicate Abstraction with Aliasing

In Section 2 we outlined our novel predicate abstraction technique, ignoring complications
introduced by pointers and aliasing. We now explain how symmetry-aware predicate
abstraction is realized in practice, for C programs that manipulate pointers. We
impose one restriction: we do not consider programs where a shared pointer variable, or
a pointer variable local to thread i, can point to a variable local to thread j (with $j \neq i$).
This scenario arises only when a thread copies the address of a stack or thread-local
variable to the shared state. This unusual programming style allows thread i to directly
modify the local state of thread j at the C program level, breaking the asynchronous
model of computation assumed by our method.

For ease of presentation we consider the scenario where program variables either
have a base type (e.g. int or float), or pointer type (e.g. int* or float**). Our method
can be extended to handle records, arrays and heap-allocated memory in a straightforward
but laborious manner. As in [2], we also assume that input programs have been
processed so that l-values involve at most one pointer dereference.

Alias information is important in deciding, once and for all, whether predicates
should be classed as local, mixed or shared. For example, let p be a local variable
of type int*, and consider predicate $\phi$ :: (*p == 1). Clearly $\phi$ is not shared since it
depends on local variable p. Whether $\phi$ should be regarded as a local or mixed predicate
depends on whether p may point to the shared state: we regard $\phi$ as local if p can
never point to a shared variable, otherwise $\phi$ is classed as mixed. Alias information also
lets us determine whether a variable update may affect the truth of a given predicate,
and whether it is necessary to notify other threads of this update. We now show how
these intuitions can be formally integrated with our predicate abstraction technique.
This involves suitably refining the notions of local, shared and mixed predicates, and the
definitions of affects and must_notify introduced in Section 2 and used by Algorithm 1.

3.1 Aliasing, Locations of Expressions, and Targets of l-values

We assume the existence of a sound pointer alias analysis for concurrent programs,
e.g. [9], which we treat as a black box. This procedure conservatively tells us whether
a shared variable with pointer type may refer to a local variable. As discussed at the
start of Section 3, we reject programs where this is the case.² Otherwise, for a program
template P over variables V, alias analysis yields a relation $\mapsto_d\ \subseteq V \times V$ for each
program location d. For $v, w \in V$, if $v \not\mapsto_d w$ then v provably does not point to w at d.

For an expression $\phi$ and program point d, we write loc($\phi$, d) for the set of variables
that it may be necessary to access in order to evaluate $\phi$ at d, during an arbitrary program
run. The precision of loc($\phi$, d) is directly related to the precision of alias analysis.

Definition 2 For a constant value z, $v \in V$ and k > 0, we define:

    $loc(z, d) = \emptyset \qquad loc(\&v, d) = \emptyset \qquad loc(v, d) = \{v\}$
    $loc(\underbrace{*\cdots*}_{k}v, d) = \{v\} \cup \bigcup_{w \in V} \{\, loc(\underbrace{*\cdots*}_{k-1}w, d) \mid v \mapsto_d w \,\}$

For other compound expressions, loc($\phi$, d) is defined recursively in the obvious way.

Definition 2 captures the fact that evaluating a pointer dereference *v involves reading 
both v and the variable to which v points, while evaluating an "address-of" expression, 
&v, requires no variable accesses: addresses of variables are fixed at compile time. 

Definition 3 For an expression $\phi$, Loc($\phi$) is the set of variables that may need to be
accessed to evaluate $\phi$ at an arbitrary program point during an arbitrary program run:

    $Loc(\phi) = \bigcup_{1 \leq d \leq k} loc(\phi, d)$

Note how this definition of Loc generalizes that used in Section 2. 



² This also eliminates the possibility of thread i pointing to variables in thread $j \neq i$: the address
of a variable in thread j would have to be communicated to thread i via a shared variable.



Definition 4 We write targets(x, d) for the set of variables that may be modified by
writing to l-value x at program point d:

    $targets(v, d) = \{v\} \qquad targets(*v, d) = \{w \in V \mid v \mapsto_d w\}$

Note that we have $targets(*v, d) \neq loc(*v, d)$. This is because writing through *v
modifies only the variable to which v points, while reading the value of *v involves
reading the value of v, to determine which variable w is pointed to by v, and then
reading the value of w.
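
The difference between loc and targets can be made concrete with a small Python sketch. It assumes, for illustration only, that the may-point-to relation at one program location is given as a dictionary from pointer variables to sets of variables, and it covers just the variable and dereference cases of Definitions 2 and 4; the names `loc_deref`, `targets` and `points_to_d` are hypothetical.

```python
def loc_deref(v, k, points_to):
    """loc(*...*v, d) with k dereferences, for the may-point-to relation at d."""
    if k == 0:
        return {v}                       # loc(v, d) = {v}
    result = {v}                         # reading *v always reads v itself
    for w in points_to.get(v, ()):
        result |= loc_deref(w, k - 1, points_to)
    return result

def targets(v, dereferenced, points_to):
    """Variables possibly modified by writing to v (or to *v if dereferenced)."""
    return set(points_to.get(v, ())) if dereferenced else {v}

points_to_d = {"p": {"t", "l"}}          # hypothetical alias facts at location d

print(sorted(loc_deref("p", 1, points_to_d)))   # ['l', 'p', 't']
print(sorted(targets("p", True, points_to_d)))  # ['l', 't']
print(sorted(targets("p", False, points_to_d))) # ['p']
```

Reading *p touches p as well as its possible pointees, while writing through *p can modify only the pointees.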

3.2 Shared, Local and Mixed Predicates in the Presence of Aliasing 

In the presence of pointers, we define the notion of a predicate $\phi$ being shared, local,
or mixed exactly as in Section 2.1, only with the generalization of Loc presented in
Definition 3. In Section 2.1, without pointers, we could classify $\phi$ purely syntactically,
based on whether any shared variables appear in $\phi$. In the presence of pointers, we must
classify $\phi$ with respect to alias information; our definition of Loc takes care of this.

Recall from Section 2.1 that we defined affects(v, $\phi$) = ($v \in Loc(\phi)$) to indicate
that updating variable v may affect the truth of predicate $\phi$. In the presence of pointers,
this definition no longer suffices. The truth of $\phi$ may be affected by assigning to l-value
x if x may alias some variable on which $\phi$ depends. Whether this is the case depends
on the program point at which the update occurs. Definitions 2 and 4 of loc and targets
allow us to express this:

    $\mathit{affects}(x, \phi, d) = (targets(x, d) \cap loc(\phi, d) \neq \emptyset)$.

We also need to determine whether an update affects the truth of a predicate only
for the thread executing the update, or for all threads. The definition of must_notify
presented in Section 2.1 needs to be adapted to take aliasing into account. At first sight,
it seems that we must simply parameterise affects according to program location, and
replace the conjunct $v \in V_S$ with the condition that x may target some shared variable:

    $\mathit{must\_notify}(x, \phi, d) = \mathit{affects}(x, \phi, d) \wedge (Loc(\phi) \cap V_L \neq \emptyset) \wedge (targets(x, d) \cap V_S \neq \emptyset)$.

However, this is unnecessarily strict. We can refine the above definition to minimise the
extent to which notifications are required, as follows:

    $\mathit{must\_notify}(x, \phi, d) = (targets(x, d) \cap Loc(\phi) \cap V_S \neq \emptyset) \wedge (Loc(\phi) \cap V_L \neq \emptyset)$.

The refined definition avoids the need for thread notification in the following scenario.
Suppose we have shared variables s and t, local variable l, local pointer variable
p, and predicate $\phi$ :: (s > l). Consider an assignment to *p at program point d. Suppose
that alias analysis tells us exactly $p \mapsto_d t$ and $p \mapsto_d l$. The only shared variable
that can be modified by assigning through *p at program point d is t, and the truth of
$\phi$ does not depend on t. Thus the assignment does not require a "notify-all" with respect
to $\phi$. Working through the definitions, we find that our refinement of must_notify
correctly determines this, while the direct extension of must_notify from Section 2.1
would lead to an unnecessary "notify-all".
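
Working through the definitions for this scenario can be done mechanically. The following Python sketch encodes both variants of must_notify over plain sets; the function names and the set-based encoding are assumptions made for illustration, and since the predicate contains no pointers, loc and Loc coincide here.

```python
def must_notify_direct(tgts, loc_phi_d, Loc_phi, shared_vars, local_vars):
    """Direct extension of the Section 2.1 definition."""
    return (bool(tgts & loc_phi_d)           # affects(x, phi, d)
            and bool(Loc_phi & local_vars)   # phi refers to a local variable
            and bool(tgts & shared_vars))    # x may target a shared variable

def must_notify_refined(tgts, Loc_phi, shared_vars, local_vars):
    """Refined definition: a shared variable of phi itself must be targeted."""
    return bool(tgts & Loc_phi & shared_vars) and bool(Loc_phi & local_vars)

V_S, V_L = {"s", "t"}, {"l", "p"}
Loc_phi = {"s", "l"}   # phi :: (s > l)
tgts = {"t", "l"}      # targets(*p, d): p may point to t or to l

direct = must_notify_direct(tgts, Loc_phi, Loc_phi, V_S, V_L)
refined = must_notify_refined(tgts, Loc_phi, V_S, V_L)
print(direct, refined)  # True False
```

The direct extension requests a broadcast because *p may modify l (which affects the predicate) and may modify t (which is shared); the refined version recognizes that no shared variable of the predicate is targeted.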

The predicate abstraction algorithm, Algorithm 1, can now be adapted to handle 
pointers: parameter d is simply added to the uses of affects and must_notify. Handling
of pointers in weakest preconditions works as in standard predicate abstraction [2], 
using Morris's general axiom of assignment [10]. 

4 Closing the CEGAR Loop 

So far we have presented a novel technique for predicate-abstracting symmetric con- 
current programs. We have integrated our method with the SatAbs CEGAR-based 
verifier [11], using the Cartesian abstraction method [8] and the maximum cube length 
approximation [2]. We now sketch how we have adapted the other phases of the CEGAR
loop: model checking, simulation and refinement, to accurately handle concurrency;
a detailed description of the entire process is left to an extended version of this
paper. 

Model checking Boolean broadcast programs. Our predicate abstraction technique 
generates a concurrent Boolean broadcast program. The extended syntax and semantics 
for broadcasts mean that we cannot simply use existing concurrent Boolean program 
model checkers, such as BOOM [12] or Boppo [13], for the model checking phase of 
the CEGAR loop. We have implemented a prototype extension of Boom, which we call 
B-Boom. B-Boom extends the counter abstraction-based symmetry reduction capa- 
bilities of Boom [5] to support broadcast operations. Symbolic image computation for 
broadcast assignments is significantly more expensive than image computation for stan- 
dard assignments. In the context of Boom it involves 1) converting states from counter 
representation to a form where the individual local states of threads are stored using 
distinct BDD variables, 2) computing the intersection of n — 1 successor states, one 
for each inactive thread paired with the active thread, and 3) transforming the result- 
ing state representation back to counter form using Shannon expansion. The expense of 
image computation for broadcasts motivates the careful analysis we have presented in 
Sections 2 and 3 for determining tight conditions under which broadcasts are required. 
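
The three steps can be illustrated with a small explicit-state sketch in Python. It is not the B-Boom implementation, which works symbolically on BDDs: here a counter state is a multiset of local states, shared variables are omitted, and the names `expand`, `collapse` and `broadcast_image` are assumptions made for this illustration. Updating every passive thread against the chosen active thread plays the role of combining the n − 1 pairwise successor constraints.

```python
from collections import Counter

def expand(counters):
    """Step 1: counter form -> one explicit local state per thread."""
    return sorted(counters.elements())

def collapse(thread_locals):
    """Step 3: explicit per-thread local states -> counter form."""
    return Counter(thread_locals)

def broadcast_image(counters, active_update, passive_update):
    """Successor counter states of one broadcast step, over all choices of active thread.

    active_update(a)     -> new local state of the active thread
    passive_update(a, p) -> new local state of a passive thread, given the
                            old local states a (active) and p (passive)
    """
    threads = expand(counters)
    successors = set()
    for active, a_local in enumerate(threads):
        # Step 2: each passive thread is updated against the active thread.
        new_threads = [
            active_update(a_local) if t == active else passive_update(a_local, p_local)
            for t, p_local in enumerate(threads)
        ]
        successors.add(frozenset(collapse(new_threads).items()))
    return successors

# Hypothetical example: each thread has one Boolean local b.
# The active thread sets b to 1; a passive thread keeps b only if the
# active thread's old b was 0.
start = Counter({0: 2, 1: 1})
image = broadcast_image(start, lambda a: 1, lambda a, p: p & (1 - a))
```

Even in this toy setting the cost is visible: every successor requires touching all n local states, whereas a standard assignment only moves a single thread between two counters.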

Simulation. To determine the authenticity of abstract error traces reported by B-Boom 
we have extended the SatAbs simulator. The existing simulator extracts the control 
flow from the trace. This is mapped back to the original C program and translated into 
a propositional formula (using standard techniques such as single static assignment 
conversion and bitvector interpretation of variables). The error is spurious exactly if 
this formula is unsatisfiable. In the concurrent case, the control flow information of an 
abstract trace includes which thread executes actively in each step. We have extended 
the simulator so that each local variable involved in a step is replaced by a fresh indexed 
version, indicating the executing thread that owns the variable. The result is a program 
trace over the replicated C program P^n, which can be directly checked using a SAT 
solver. 

As an example, suppose function f from program P'' (Introduction) is executed by 
2 threads (for this example, we ignore the rest of P''). The model checker may return 
the error trace shown below on the left, which is converted into the non-threaded form 
shown on the right. 

    Thread   Abstract error trace   Non-threaded form
    T1       ++s, ++l;              ++s, ++l_1;
    T1       assert s == l;         assert s == l_1;
    T2       ++s, ++l;              ++s, ++l_2;
    T2       assert s == l;         assert s == l_2;

The trace on the right is translated into SSA form and finally into the integer arithmetic 
formula below, which is shown to be satisfiable, so the error is real. 

$$s^0 = 0 \land l_1^0 = 0 \land l_2^0 = 0 \land s^1 = s^0 + 1 \land l_1^1 = l_1^0 + 1 \land s^1 = l_1^1 \land s^2 = s^1 + 1 \land l_2^1 = l_2^0 + 1 \land s^2 \neq l_2^1$$



Note that broadcast operations do not affect simulation: although, at the Boolean 
program level, a broadcast may change the state of multiple threads simultaneously, the 
corresponding C program statement is simply an update to a shared variable executed 
by a single thread. 
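
The simulation of the two-thread trace above can be sketched in Python as follows. This is an illustration only, not the SatAbs simulator: the trace encoding, the function names, and the assumption that all variables start at 0 are ours. Because this particular trace has no nondeterministic inputs, replaying it concretely decides satisfiability of the SSA formula; in general a SAT solver is needed.

```python
LOCALS = {"l"}  # assumed: local variables of the template; other names are shared

def index_var(var, thread):
    """Thread-indexed copy for locals, unchanged name for shared variables."""
    return f"{var}_{thread}" if var in LOCALS else var

def to_ssa(trace):
    """SSA constraints for a trace whose last step is the failing assertion."""
    version, constraints = {}, []

    def current(v):
        if v not in version:              # first use: record initial value 0
            version[v] = 0
            constraints.append(f"{v}^0 = 0")
        return f"{v}^{version[v]}"

    for pos, (thread, op, args) in enumerate(trace):
        names = [index_var(a, thread) for a in args]
        if op == "inc":
            for v in names:
                old = current(v)
                version[v] += 1
                constraints.append(f"{v}^{version[v]} = {old} + 1")
        else:  # "assert_eq": holds at every step except the last one
            rel = "!=" if pos == len(trace) - 1 else "="
            lhs, rhs = current(names[0]), current(names[1])
            constraints.append(f"{lhs} {rel} {rhs}")
    return constraints

def is_real_error(trace):
    """Replay the indexed trace concretely from the all-zero initial state."""
    store = {}
    for pos, (thread, op, args) in enumerate(trace):
        names = [index_var(a, thread) for a in args]
        if op == "inc":
            for v in names:
                store[v] = store.get(v, 0) + 1
        else:
            holds = store.get(names[0], 0) == store.get(names[1], 0)
            if pos == len(trace) - 1:
                return not holds          # real iff the final assertion fails
            if not holds:
                return False              # an earlier assertion blocks the trace
    return False

trace = [(1, "inc", ["s", "l"]), (1, "assert_eq", ["s", "l"]),
         (2, "inc", ["s", "l"]), (2, "assert_eq", ["s", "l"])]
formula = to_ssa(trace)
```

Note that only the local variable `l` receives a thread index; the shared variable `s` is versioned once for the whole trace, which is what makes the second assertion fail.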

Refinement. Our implementation performs refinement by extracting new predicates 
from counterexamples via weakest precondition calculations. This standard method re- 
quires a small modification in our context: weakest precondition calculations generate 
predicates over shared variables, and local variables of specific threads. For example, if 
thread 1 branches according to a condition such as l < s, where l and s are local and 
shared, respectively, weakest precondition calculations generate the predicate l_1 < s, 
where l_1 is thread 1's copy of l. Because our predicate abstraction technique works at 
the template program level, we cannot add this predicate directly. Instead, we generalize 
such predicates by removing thread indices. Hence in the above example, we add the 
mixed predicate l < s, for all threads. 
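
The generalization step amounts to a syntactic renaming. The Python sketch below assumes that predicates are given as strings and that thread copies are written `name_index`; both the representation and the helper names `generalize` and `classify` are hypothetical and chosen for illustration.

```python
import re

def generalize(predicate, local_names):
    """Drop thread indices from local variables: 'l_1 < s' becomes 'l < s'."""
    alternatives = "|".join(re.escape(v) for v in sorted(local_names))
    pattern = rf"\b({alternatives})_\d+\b"
    return re.sub(pattern, r"\1", predicate)

def classify(predicate, local_names, shared_names):
    """Label a template-level predicate as 'local', 'shared' or 'mixed'."""
    identifiers = set(re.findall(r"[A-Za-z_]\w*", predicate))
    has_local = bool(identifiers & set(local_names))
    has_shared = bool(identifiers & set(shared_names))
    if has_local and has_shared:
        return "mixed"
    return "local" if has_local else "shared"

template_predicate = generalize("l_1 < s", {"l"})
kind = classify(template_predicate, {"l"}, {"s"})
```

In this example the refined predicate is classified as mixed, so it is subject to the broadcast-based treatment rather than being stored as a plain local or shared Boolean variable.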

An alternative approach is to refine the abstract transition relation associated with 
the Cartesian abstraction based on infeasible steps in the abstract counterexample [14]. 
We do not yet perform such refinement, due to the challenge of correctly refining abstract 
transitions involving broadcast assignments. This involves some subtle issues which 
will require further research to solve. 



5 Experimental Results 

We evaluate the SatAbs-based implementation of our techniques using a set of 14 
concurrent C programs. We consider benchmarks where threads synchronize via locks 
(lock-based), or in a lock-free manner via atomic compare-and-swap (cas) or test-and- 
set (tas) instructions. The benchmarks are as follows: 3 

- Increment, Inc./Dec. (lock-based and cas-based) A counter, concurrently incre- 
mented, or incremented and decremented, by multiple threads [15] 

- Prng (lock-based and cas-based) Concurrent pseudorandom number generator [15] 



3 All benchmarks and tools are available online: http://www.cprover.org/SAPA 



- Stack (lock-based and cas-based) Thread-safe stack implementation, supporting 
concurrent pushes and pops, adapted from an Open Source IBM implementation 4 
of an algorithm described in [15] 

- Tas Lock, Ticket Lock (tas-based) Concurrent lock implementations [16] 

- FindMax, FindMaxOpt (lock-based and cas-based) Implementations of a parallel 
reduction operation [17] to find the maximum element in an array. FindMax is a basic 
implementation, and FindMaxOpt is an optimized version where threads reduce 
communication by computing a partial maximum value locally 

Mixed predicates were required for verification to succeed in all but two bench- 
marks: lock-based Prng, and lock-based Stack. For each benchmark, we consider ver- 
ification of a simple safety property, specified via an assertion. We have also prepared 
a buggy version of each benchmark, where an error is injected into the source code to 
make it possible for this assertion to fail. We refer to correct and buggy versions of our 
benchmarks as safe and unsafe, respectively. 

All experiments are performed on a 3GHz Intel Xeon machine with 40 GB RAM, 
running 64-bit Linux, with separate timeouts of 1h for the abstraction and model checking 
phases of the CEGAR loop. Predicate abstraction uses a maximum cube length of 3 
for all examples, and MiniSat 2 (compiled with full optimizations) is used for predicate 
abstraction and counterexample simulation. 

Table 1. Comparison of symmetry-aware and symmetry-oblivious predicate abstraction over our 
benchmarks. For each configuration, the fastest abstraction and model checking times are in bold. 



4 http://amino-cbbs.sourceforge.net 



Symmetry-aware vs. symmetry-oblivious method. We evaluate the scalability of our 
symmetry-aware predicate abstraction technique (SAPA) by comparing it against the 
symmetry-oblivious predicate abstraction (SOPA) approach described in Section 1, for 
verification of correct versions of our benchmarks. Recall that in SOPA, an n-thread 
symmetric concurrent program is expanded so that variables for all threads are explicitly 
duplicated, and n copies of all non-shared predicates are generated. The expanded 
program is then abstracted over the expanded set of predicates, using standard predicate 
abstraction. This yields a Boolean program for each thread; the parallel composition of 
these n Boolean programs is explored by a model checker. Because symmetry is not 
exploited, and no broadcasts are required, any suitable model checker can be used. 
We have tried both standard Boom [5] (without symmetry reduction) and Cadence 
SMV [18] to model check expanded Boolean programs. In all cases, we found Boom 
to be faster than SMV, thus we present results only for Boom. 
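
The predicate expansion performed by SOPA can be sketched in Python as follows. The string representation of predicates and the function name `expand_predicates` are assumptions for illustration; the point is only that the predicate count grows linearly with n, whereas the template-level predicate set used by SAPA does not depend on n.

```python
import re

def expand_predicates(predicates, local_names, n):
    """SOPA-style expansion: one copy of each non-shared predicate per thread."""
    def mentions_local(pred):
        return any(re.search(rf"\b{re.escape(v)}\b", pred) for v in local_names)

    def index(pred, thread):
        # Rename every local variable to its copy owned by the given thread.
        for v in local_names:
            pred = re.sub(rf"\b{re.escape(v)}\b", f"{v}_{thread}", pred)
        return pred

    expanded = []
    for pred in predicates:
        if mentions_local(pred):
            expanded.extend(index(pred, t) for t in range(1, n + 1))
        else:
            expanded.append(pred)         # shared predicates are kept once
    return expanded

# Hypothetical template predicates: one shared, one mixed, one local.
template = ["s == 0", "l < s", "l == 0"]
sopa_3 = expand_predicates(template, {"l"}, 3)
sopa_8 = expand_predicates(template, {"l"}, 8)
```

With these three template predicates, the expanded set has 1 + 2n elements, so the abstraction for 8 threads already works over 17 predicates instead of 3.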

Table 1 presents the results of the comparison. For each benchmark and each ap- 
proach we show, for interesting thread counts (including the largest thread count that 
could be verified with each approach), the number of predicates required for verifica- 
tion and the elapsed time for predicate abstraction and model checking. For each con- 
figuration, the fastest abstraction and model checking times are shown in bold. Model 
checking uses standard Boom, without symmetry reduction (SOPA) and B-Boom, our 
novel extension to Boom discussed in Section 4 (SAPA), respectively. Entries marked 
TO. indicate that a timeout occurred; succeeding cells are then marked 

The results show that our novel SAPA technique significantly outperforms SOPA, 
both in terms of abstraction and model checking time. The former can be attributed to 
the fact that, with SOPA, the number of predicates grows according to the number of 
threads considered, while with SAPA, this is thread count-independent. The latter is due 
to the exploitation of template-level symmetry by B-Boom. 



                         | Symmetry-Aware         | Mixed as local         | Mixed as shared
Benchmark                | Safe     n  Unsafe   n | Safe     n  Unsafe   n | Safe     n  Unsafe   n
Increment (lock-based)   | safe   >10  unsafe   2 | safe   >10  error    2 | safe    10  error    2
Incr. (cas-based)        | safe     7  unsafe   2 | safe     8  safe     5 | error    2  error    2
Incr./Dec. (lock-based)  | safe        unsafe   3 | safe   >10  safe   >10 | safe   >10  unsafe   3
Incr./Dec. (cas-based)   | safe     4  unsafe   3 | safe     6  safe     8 | error    2  error    3
Tas Lock (tas-based)     | safe     7  unsafe   2 | safe     8  error    2 | error    2  error    2
Ticket Lock (tas-based)  | safe     S  unsafe   3 | safe   >10  unsafe   3 | safe        unsafe   3
Prng (lock-based)        | safe   >10  unsafe   2 | safe   >10  unsafe   ? | safe   >10  unsafe   2
Prng (cas-based)         | safe     5  unsafe   3 | safe     7  unsafe   3 | safe     6  unsafe   3
FindMax (lock-based)     | safe   >10  unsafe   2 | safe   >10  safe   >10 | safe     2  error    2
FindMax (cas-based)      | safe     4  unsafe   2 | safe     5  safe     4 | safe     2  safe     1
FindMaxOpt (lock-based)  | safe     7  unsafe   2 | safe     7  safe     6 | error    2  error    2
FindMaxOpt (cas-based)   | safe     5  unsafe   1 | safe     5  unsafe   1 | error    2  unsafe   1
Stack (lock-based)       | safe     4  unsafe   4 | safe     4  unsafe   4 | safe     4  unsafe   4
Stack (cas)              | safe     4  unsafe   2 | safe     4  safe     6 | safe     4  error    2



Table 2. Comparison of sound and unsound approaches; incorrect results in bold. 



Comparison with unsound methods. In Section 1, we described two naive solutions 
to the mixed predicate problem: uniformly using local or shared Boolean variables to 
represent mixed predicates, and then performing standard predicate abstraction. We denote 
these approaches mixed as local and mixed as shared, respectively. Although we 
demonstrated theoretically in Section 1 that both methods are unsound, it is interesting 
to see how they perform in practice. Table 2 shows the results of applying CEGAR- 
based model checking to safe and unsafe versions of our benchmarks, using our sound 
technique, and the unsound mixed as local and mixed as shared approaches. In all cases, 
B-Boom is used for model checking. For the sound technique, we show the largest 
thread count for which we could prove correctness of each safe benchmark, and the 
smallest thread count for which a bug was revealed in each unsafe benchmark. The 
other columns illustrate how the unsound techniques differ from this, where "error" indicates 
a refinement failure: it was not possible to extract further predicates from spurious 
counterexamples. Bold entries indicate cases where the unsound approaches produce 
incorrect, or inconclusive results. 5 The number of cases where the unsound approaches 
produce false negatives, or lead to refinement failure, suggests that little confidence can 
be placed in these techniques, even for purposes of falsification. This justifies the more 
sophisticated and, crucially, sound techniques developed in this paper. 



6 Related Work and Conclusion 

There exists a large body of work on the different stages of CEGAR-based program 
analysis. We focus here on the abstraction stage, which is at the heart of this paper. 

Predicate abstraction goes back to the foundational work by Graf and Saïdi [1]. 
It was first presented for sequential programs in a mainstream language (C) by Ball, 
Majumdar, Millstein, Rajamani [2] and implemented as part of the Slam project. We 
have found many of the optimizations suggested by [2] to be useful in our implementa- 
tion as well. Although Slam has had great success in finding real bugs in system-level 
code, we are not aware of any extensions of it to concurrent programs (although this 
option is mentioned by the authors). We attribute this in large part to the infeasibility, 
at the time, of handling realistic multi-threaded Boolean programs. We believe our own 
work on Boom [5] has made progress in this direction that has made it attractive again 
to address concurrent predicate abstraction. 

We are not aware of other work that presents solutions to the problem of "mixed 
predicates". Some approaches avoid it by syntactically disallowing such predicates, 
e.g. [19], whose authors don't discuss, however, the reasons for (or, indeed, the conse- 
quences of) doing so. In other work, "algorithmic circumstances" may make the treat- 
ment of such predicates unnecessary. The authors of [20], for example, use predicate 
abstraction to finitely represent the environment of a thread in multi-threaded programs. 
The "environment" consists of assumptions on how threads may manipulate the shared 
state of the program, irrespective of their local state. Our case of replicated threads, in 
which mixed predicates would constitute a problem, is only briefly mentioned in [20]. 
In [21], an approach is presented that handles recursive concurrent C programs. The 
abstract transition system of a thread (a pushdown system) is formed over predicates 
that are projected to the global or the local program variables and thus cannot compare 



5 We never expect the unsound techniques to report conclusively that a safe benchmark is unsafe: 
this would require demonstrating a concrete error trace in the original, safe, program. 



"global against local" directly. As we have discussed, some reachability problems can- 
not be solved using such restricted predicates. We conjecture this problem is one of the 
potential causes of non-termination in the algorithm of [21]. 

In conclusion, we mention that building a CEGAR-based verification strategy is a 
tremendous effort, and our work so far can only be the beginning of such an effort. We 
have assumed a very strict (and unrealistic) memory model that guarantees atomicity 
at the statement level. One can work soundly with this assumption by pre-processing 
input programs so that the shared state is accessed only via word-length 
reads and writes, ensuring that all computation is performed using local variables. Ex- 
tending our approach to weaker memory models, building on existing work in this 
area [22,23], is future work. Our plans also include a more sophisticated refinement 
strategy, and a more detailed comparison with existing approaches that circumvent the 
mixed-predicates problem using other means. 

A Boolean Broadcast Programs 



Our approach to symmetry-aware predicate abstraction generates a form of concurrent 
Boolean program. Although languages for such programs are well-known [13,5], the 
existing ones are not quite suited to our purpose. As discussed in the paper, in order to 
handle mixed predicates correctly, we require the facility for a Boolean program thread 
to read and update variables of other threads. As this is fundamental to our approach, 
we now present syntax and semantics for this variant of concurrent Boolean programs 
which we call Boolean broadcast programs. 



    prog := shared name = lit; ...; name = lit;
            local name = lit; ...; name = lit;
            1: stmt; ...; k: stmt;
    stmt := lval, ..., lval = expr, ..., expr
          | goto pc_1, ..., pc_d   (pc_i ∈ {1, ..., k})
          | assume expr
    lval := v | [v]
    expr := lval | lit | compound expr
    lit  := 0 | 1 | *
    name := any legal C variable name



Fig. 1. Syntax for Boolean broadcast programs. 



Syntax for Boolean broadcast programs is specified by the grammar of Figure 1, 
where standard compound expressions are assumed. The language includes standard 
features such as shared and local variables, parallel variable assignments, nondeter- 
ministic goto statements and assume statements (which together can be used to model 
control flow), and the nondeterministic r-value *. 

The novel feature of this language is that it supports assignments of the form [v] 
= expr, which we call broadcasts. When such a broadcast is executed by thread i, it 
causes local variable v to be updated in each passive thread, i.e. in all threads except i. 
In this context, expr ranges over shared variables, local variables of the active thread 
(thread i), and local variables of the passive thread. The latter are distinguished by the 
syntax [v]. We refer to an expression of the form [v] as a passive expression. We refer 
to such an expression as a passive l-value or passive r-value depending on which side 
of a broadcast it occurs. 

We give semantics for parallel assignments in which all passive l-values appear after 
non-passive l-values. For parallel assignments where this is not the case, the semantics 
are defined to be the same as for any suitable rearrangement that enforces this condition. 
In a parallel assignment of the form: 

$u_1, \ldots, u_a, [v_1], \ldots, [v_b] = \ldots$ 

we require that the $u_i$ are mutually distinct, the $v_i$ are mutually distinct, and each 
$v_i \in V_L$. We do allow the $u_j$ and $v_j$ to overlap. We also require that passive r-values 
only appear on the right-hand-side of broadcasts (i.e. they cannot be used in standard 
assignments). 

We now formally define semantics for parallel assignments that may contain broadcasts. 
Let $B$ be a Boolean broadcast program, and $n$ a positive integer. Let $V = V_S \cup V_L$ 
be the set of shared ($V_S$) and local ($V_L$) variable names appearing in $B$. Define 
$V^n = V_S \cup \bigcup_{1 \le i \le n} \{ l_i \mid l \in V_L \}$. This is the full set of variables in $B^n$, an $n$-thread 
instantiation of $B$. A store for $B^n$ is a mapping $\sigma : V^n \to \{0, 1\}$, assigning a Boolean value to 
each variable. 

For an expression $\phi$, thread indices $i$ and $j$ ($1 \le i, j \le n$) and store $\sigma$, we write 
$eval(\phi, i, j, \sigma) \subseteq \{0, 1\}$ for the set of possible results which can be obtained by evaluating 
$\phi$ with respect to active thread $i$ and passive thread $j$ in the context of $\sigma$. Formally, 
for $l \in V_L$, $s \in V_S$ and $z \in \{0, 1\}$, $eval(\phi, i, j, \sigma)$ is defined for simple expressions as 
follows: 

$eval(z, i, j, \sigma) = \{z\}$    $eval(*, i, j, \sigma) = \{0, 1\}$ 
$eval(l, i, j, \sigma) = \{\sigma(l_i)\}$    $eval([l], i, j, \sigma) = \{\sigma(l_j)\}$    $eval(s, i, j, \sigma) = \{\sigma(s)\}$ 

and extended to compound expressions in the obvious way. 

If $\phi$ has no passive subexpressions then it is clear that, for any threads $j_1$ and $j_2$, 
$eval(\phi, i, j_1, \sigma) = eval(\phi, i, j_2, \sigma)$: the passive thread is irrelevant. In this case we 
simply write $eval(\phi, i, \sigma)$. 

For a parallel assignment statement $assign$, store $\sigma$ and thread id $i$ ($1 \le i \le n$), 
we write $exec(assign, i, \sigma)$ for the set of stores that can result from active thread $i$ 
executing statement $assign$ in the context of store $\sigma$. 

Without loss of generality, we can assume that $assign$ has the following form: 

$s_1, \ldots, s_a, l_1, \ldots, l_b, [m_1], \ldots, [m_c] = \phi_1, \ldots, \phi_a, \psi_1, \ldots, \psi_b, \chi_1, \ldots, \chi_c$, 

where $a, b, c \ge 0$, $s_1, \ldots, s_a \in V_S$ are mutually distinct shared variables, $l_1, \ldots, l_b \in V_L$ 
are mutually distinct local variables, and $m_1, \ldots, m_c \in V_L$ are also mutually distinct 
local variables (the $l_j$ and $m_j$ may overlap), $\phi_1, \ldots, \phi_a$ and $\psi_1, \ldots, \psi_b$ are expressions 
that do not contain passive subexpressions, and $\chi_1, \ldots, \chi_c$ are expressions that 
may contain passive subexpressions. 

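With this notation, we define (here $l_g^i$ and $m_h^j$ denote the copies of $l_g$ and $m_h$ owned by threads $i$ and $j$): 

$$exec(assign, i, \sigma) = \left\{ \sigma \left[ \begin{array}{l} s_1 \mapsto x_1, \ldots, s_a \mapsto x_a, \\ l_1^i \mapsto y_1, \ldots, l_b^i \mapsto y_b, \\ m_1^1 \mapsto z_{1,1}, \ldots, m_c^1 \mapsto z_{1,c}, \\ \quad \vdots \\ m_1^{i-1} \mapsto z_{i-1,1}, \ldots, m_c^{i-1} \mapsto z_{i-1,c}, \\ m_1^{i+1} \mapsto z_{i+1,1}, \ldots, m_c^{i+1} \mapsto z_{i+1,c}, \\ \quad \vdots \\ m_1^n \mapsto z_{n,1}, \ldots, m_c^n \mapsto z_{n,c} \end{array} \right] \;\middle|\; \begin{array}{l} x_f \in eval(\phi_f, i, \sigma), \\ y_g \in eval(\psi_g, i, \sigma), \\ z_{j,h} \in eval(\chi_h, i, j, \sigma), \\ 1 \le f \le a, \; 1 \le g \le b, \\ 1 \le h \le c, \; 1 \le j \le n, \; i \ne j \end{array} \right\}$$

Thus each store in $exec(assign, i, \sigma)$ is derived from $\sigma$ by setting each variable $s_f$ 
and $l_g^i$ to a value of $eval(\phi_f, i, \sigma)$ and $eval(\psi_g, i, \sigma)$ respectively ($1 \le f \le a$, 
$1 \le g \le b$), and, for each thread $j$ distinct from $i$, setting variable $m_h^j$ to a value of 
$eval(\chi_h, i, j, \sigma)$ ($1 \le h \le c$). 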

This definition makes precise the meaning of the [.] notation used in the paper. The 
definitions of exec and eval can be used to define the transition system associated with 
a Boolean broadcast program in the standard way. 
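
The definitions of eval and exec can be made concrete with a small explicit-state interpreter. The Python sketch below is an illustration under our own assumptions: expressions are nested tuples, a store is a dictionary keyed by shared names and by (local name, thread) pairs, and the function names `eval_expr` and `exec_assign` are hypothetical.

```python
from itertools import product

def eval_expr(e, i, j, store):
    """Set of possible values of e for active thread i and passive thread j."""
    tag = e[0]
    if tag == "lit":
        return {e[1]}
    if tag == "*":
        return {0, 1}
    if tag == "shared":
        return {store[e[1]]}
    if tag == "local":
        return {store[(e[1], i)]}
    if tag == "passive":
        return {store[(e[1], j)]}
    if tag == "not":
        return {1 - x for x in eval_expr(e[1], i, j, store)}
    if tag in ("and", "or"):
        left = eval_expr(e[1], i, j, store)
        right = eval_expr(e[2], i, j, store)
        return {(x & y) if tag == "and" else (x | y) for x in left for y in right}
    raise ValueError(f"unknown expression tag: {tag}")

def exec_assign(targets, exprs, i, n, store):
    """All stores that can result when thread i executes targets = exprs."""
    slots, choices = [], []
    for (kind, name), e in zip(targets, exprs):
        if kind == "shared":
            slots.append(name)
            choices.append(eval_expr(e, i, i, store))  # passive thread irrelevant
        elif kind == "local":
            slots.append((name, i))
            choices.append(eval_expr(e, i, i, store))
        else:  # "passive": broadcast, one update per thread j != i
            for j in range(1, n + 1):
                if j != i:
                    slots.append((name, j))
                    choices.append(eval_expr(e, i, j, store))
    results = []
    for values in product(*choices):      # all right-hand sides use the old store
        new_store = dict(store)
        new_store.update(zip(slots, values))
        results.append(new_store)
    return results

# Hypothetical example: thread 1 of 3 executes  b, [b] = 1, [b] & !b
store = {"s": 0, ("b", 1): 0, ("b", 2): 1, ("b", 3): 1}
after = exec_assign(
    [("local", "b"), ("passive", "b")],
    [("lit", 1), ("and", ("passive", "b"), ("not", ("local", "b")))],
    i=1, n=3, store=store)
havoc = exec_assign([("passive", "b")], [("*",)], i=1, n=3, store=store)
```

In the second call, the nondeterministic broadcast `[b] = *` yields one successor store per combination of values for the two passive threads, since each $z_{j,h}$ is chosen independently.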



B Proving P'' Correct Requires Mixed Predicates 



Recall program P'' from Section 1, which, as we show, cannot be proved correct using only 
non-mixed predicates when executed by 1 thread. 

    0: shared int r = 0;
       shared int s = 0;
       local int l = 0;
    1: ++r;
    2: if (r == 1) then
    3:   f();

    f():
    4: ++s, ++l;
       assert s == l;
       goto 4;

Let $E = E^{r,s} \cup E^{l}$ for disjoint sets $E^{r,s}$ and $E^{l}$ of predicates over $\{r, s\}$ and $l$, resp.; 
in particular, no predicate refers to both $s$ and $l$. Suppose $I$ is an invariant of P'' expressible 
over predicates in $E$ such that $I \Rightarrow (s == l)$ is valid. Since every state satisfying 
$r == 1$, $s == l$ is reachable in P'', $(r == 1) \land (s == l) \Rightarrow I$ is valid. Therefore, for infinitely 
many $c \in \mathbb{N}$, the assignment $r = 1$, $s = l = c$ satisfies $I$, written $(r, s, l) = (1, c, c) \models I$. 

Let now $\{I_1, \ldots, I_w\}$ be the cubes in the DNF representation of $I$. Since this 
set is finite, there exist two distinct values $a$, $b$ and some $i \in \{1, \ldots, w\}$ such that 
both $(r, s, l) = (1, a, a) \models I_i$ and $(r, s, l) = (1, b, b) \models I_i$. We split cube $I_i$ into the 
sub-cubes $I_i^{r,s}$ and $I_i^{l}$ that contain the predicates over $\{r, s\}$ and those over $l$, resp.: 
$I_i = I_i^{r,s} \land I_i^{l}$. From $(r, s, l) = (1, a, a) \models I_i$ we conclude $(r, s) = (1, a) \models I_i^{r,s}$ ($l$ does 
not occur in $I_i^{r,s}$). Similarly, from $(r, s, l) = (1, b, b) \models I_i$ we conclude $l = b \models I_i^{l}$. Hence, 
$(r, s, l) = (1, a, b) \models I_i$, hence $(r, s, l) = (1, a, b) \models I$, which contradicts the validity of 
$I \Rightarrow (s == l)$ since $a \ne b$. □ 
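
The pigeonhole step of the proof can be replayed on a concrete set of non-mixed predicates. The Python sketch below uses a hypothetical sample of predicates and works with full cubes (the truth value of every predicate), which refine the DNF cubes of the proof; the function names are ours. It searches for two distinct values a and b whose states (1, a, a) and (1, b, b) fall into the same cube, and the state (1, a, b), which violates s == l, then lies in that cube as well.

```python
# Hypothetical non-mixed predicates: each refers to {r, s} only, or to l only.
RS_PREDS = [lambda r, s: r == 1, lambda r, s: s > 0, lambda r, s: s % 2 == 0]
L_PREDS = [lambda l: l > 0, lambda l: l % 2 == 0]

def cube_of(r, s, l):
    """The full cube satisfied by state (r, s, l), split into its two sub-cubes."""
    rs_part = tuple(p(r, s) for p in RS_PREDS)
    l_part = tuple(p(l) for p in L_PREDS)
    return rs_part, l_part

def find_collision(bound):
    """Find distinct a, b < bound such that (1, a, a) and (1, b, b) share a cube."""
    seen = {}
    for c in range(bound):
        key = cube_of(1, c, c)
        if key in seen:
            return seen[key], c
        seen[key] = c
    return None

a, b = find_collision(10)
# The mixed state (1, a, b) takes its {r, s} sub-cube from (1, a, a) and its
# l sub-cube from (1, b, b), so it cannot be separated from the reachable states.
indistinguishable = cube_of(1, a, b) == cube_of(1, a, a)
```

Any finite set of non-mixed predicates admits such a collision for a sufficiently large bound, because there are only finitely many cubes but infinitely many reachable states with s == l.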



